DETAILED ACTION



Specification 

1. The disclosure is objected to because of the following informalities: 
On page 22, line 20, "FIG. 7" should be --FIG. 4--. 

On page 24, lines 28 to 29, Applicants should insert the Serial Number of the 
U.S. Patent Application, as the Serial Number cannot be readily determined from the 
information provided. 

Appropriate correction is required. 

Claim Rejections - 35 USC § 102 

2. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

3. Claims 1, 10, 11, and 17 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Trompf ('968). 

Regarding independent claim 1, Trompf ('968) discloses a method of training in 
speech recognition, comprising: 

"introducing additive noise into a training signal, the additive noise being noise 
that is similar to noise that is anticipated to be present in a test signal during pattern 
recognition" - speech and noise are supplied to the speech recognition device by 
microphone M (column 3, lines 13 to 15: Figure 1); a sequence of noisy vectors is 
supplied to the neural network N by the first preprocessor (PP1), which in turn receives 
summed speech-free noise signals from the microphone M and noise-free speech 
signal from memory S (column 4, lines 24 to 28: Figure 3); Figure 3 shows a training 
phase, where speech-free noise signals are combined with noise-free speech signals; 

"applying at least one noise reduction technique to the training signal to produce 
pseudo-clean training data" - preprocessing installation PP1 is connected to neural 
network N, which performs the neural noise reduction (column 3, lines 19 to 21: Figure 
1); a first neural noise reduction is performed by mapping the noisy vectors from the first 
preprocessor into noise-free vectors; thus, a noise-reduced vector is present at the 
output of the neural network, which can be noise-free in the ideal case (column 4, lines 
28 to 32: Figure 3); the noise reduced vectors are "pseudo-clean training data" because 
the noise reduced vectors map noisy vectors to noise reduced vectors during the 
training phase of Figure 3, and the noise-reduced vectors only ideally approximate 
noise free vectors; 

"constructing the pattern recognition model based on the pseudo-clean training 
data" - the neural network is trained to perform noise reduction to recognize speech in a 
noisy environment with the noise-free vectors and the noisy vectors; the noise-reduced 
value obtained with the iterative process is now considered trained into the neural 
network N (column 3, lines 49 to 64; column 5, lines 5 to 7: Figure 3); thus, the neural 
network is a "pattern recognition model" based on the noise-reduced vectors ("pseudo- 
clean training data"). 

Regarding independent claim 11, Trompf ('968) discloses training in speech 
recognition, comprising: 

"identifying a type of noise that is expected to be present in a test signal from 
which a pattern is to be recognized" - a differentiation is made between noisy vectors Y, 
which are present in the neural network N at the time of the neural noise reduction, and 
noisy vectors X, which existed in the neural network at a previous point in time; in order 
to form conclusions about the future neural noise reduction of noisy vectors, all the 
previously mentioned information that could be drawn from the noise reduction is used 
to form conclusions about future noisy vectors Z (column 5, lines 10 to 32: Figure 2); 
implicitly, noisy vectors contain noise of the type represented by noisy vectors X and 
noisy vectors Y, which are vectors representing "a type of noise which is expected to be 
present in a test signal"; 

"generating a training signal such that the training signal contains the identified 
type of noise" - speech and noise are supplied to the speech recognition device by 
microphone M (column 3, lines 13 to 15: Figure 1); during training, a sequence of noisy 
vectors is supplied to the neural network by the first preprocessor (PP1) (column 4, lines 
21 to 32: Figure 3); 

"reducing the noise in the training signal to produce training data" - preprocessing 
installation PP1 is connected to neural network N, which performs the neural noise 
reduction (column 3, lines 19 to 21: Figure 1); during training a first neural noise 
reduction is performed by mapping the noisy vectors from the first preprocessor into 
noise-free vectors; thus, a noise-reduced vector is present at the output of the neural 
network, which can be noise-free in the ideal case (column 4, lines 28 to 32: Figure 3); 

"generating the model parameters based on the training data" - the neural 
network is trained to perform noise reduction to recognize speech in a noisy 
environment with the noise-free vectors and the noisy vectors; the noise-reduced value 
obtained with the iterative process is now considered trained into the neural network N 
(column 3, lines 49 to 64; column 5, lines 5 to 7: Figure 3); thus, the neural network 
contains "model parameters based on the training data". 

Regarding independent claim 17, Trompf ('968) discloses a speech recognition 
system, comprising: 

"a pattern recognition model having model parameters formed through a process 
comprising: generating a training signal such that the training signal includes a type of 
noise that is anticipated to be present in the test signal" - during speech recognition 
training, a sequence of noisy vectors is supplied to the neural network by the first 
preprocessor (PP1) (column 4, lines 21 to 32: Figure 3); the neural network is trained to 
have model parameters for speech recognition (column 5, lines 5 to 7); a differentiation 
is made between noisy vectors Y, which are present in the neural network N at the time 
of the neural noise reduction, and noisy vectors X, which existed in the neural network 
at a previous point in time; in order to form conclusions about the future neural noise 
reduction of noisy vectors, all the previously mentioned information that could be drawn 
from the noise reduction is used to form conclusions about future noisy vectors Z 
(column 5, lines 10 to 32: Figure 2); implicitly, noisy vectors contain noise of the type 
represented by noisy vectors X and noisy vectors Y, which are vectors representing "a 
type of noise that is to be anticipated to be present in the test signal"; 

"reducing the noise in the training signal using a noise reduction technique to 
produce cleaned training values" - preprocessing installation PP1 is connected to 
neural network N, which performs the neural noise reduction (column 3, lines 19 to 21: 
Figure 1); a first neural noise reduction is performed by mapping the noisy vectors from 
the first preprocessor into noise-free vectors; thus, a noise-reduced vector is present at 
the output of the neural network, which can be noise-free in the ideal case (column 4, 
lines 28 to 32: Figure 3); the noise-free vectors are "cleaned training values"; 

"using the cleaned training values to form the model parameters" - the neural 
network is trained to perform noise reduction to recognize speech in a noisy 
environment with the noise-free vectors and the noisy vectors; the noise-reduced value 
obtained with the iterative process is now considered trained into the neural network N 
(column 3, lines 49 to 64; column 5, lines 5 to 7: Figure 3); thus, the neural network is 
based on the noise-reduced vectors ("cleaned training values"); 

"a noise reduction module being receptive of the test signal and being capable of 
applying the noise reduction technique to the test signal to produce cleaned test values" 
- during use of the speech recognition block (I), after the to-be-described training, the 
noisy speech signals are supplied to the preprocessing installation PP1 by the 
microphone; then the noisy speech signals are supplied to the neural network N, which 
performs a noise reduction (column 3, lines 25 to 45: Figures 1 and 3); 

"a decoder, receptive of features of the cleaned test values and capable of 
accessing the pattern recognition model to identify patterns in the test signal based on 
the cleaned test values" - during speech recognition, following noise reduction, speech 
recognition is performed on noise-reduced speech vectors by speech recognition block 
(I); the speech recognition block (I) uses a standard word recognizer (column 3, lines 13 
to 29: Figures 1 and 3). 

Regarding claim 10, Trompf ('968) discloses a speech recognition process, 
where a noisy speech signal ("a test signal") is received, a neural network N performs 
noise reduction on the noisy speech signal ("to produce pseudo-clean test data"), and 
speech recognition block (I) performs speech recognition on the noise reduced noisy 
speech signal (column 3, lines 10 to 47: Figures 1 and 3). 

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 2 to 9, 12 to 16, and 18 to 29 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Trompf ('968) in view of Sameti et al. ("HMM-Based Strategies 
for Enhancement of Speech Signals Embedded in Nonstationary Noise"). 

Concerning claim 2, Trompf ('968) omits applying a plurality of noise reduction 
techniques. However, Sameti et al. discloses several speech enhancement methods, 
including spectral subtraction and HMM-based enhancement methods. (Pages 446 to 
449: II. Speech Enhancement Methods) It is stated that the noise adaptation algorithm 
is adapted to handle arbitrary types of corrupting noise by selecting a noise model, and 
switching to a model representing a new noise type if required. Sameti et al. suggests 
the method of noise model selection can successfully cope with noise level variation as 
well as different noise types, and keeps the noise model sufficiently compact so that 
excessive computation in enhancement is avoided. (Pages 449 to 450: III. Noise 
Adaptation Algorithm) In particular, MAP and AMAP noise reduction algorithms were 
only used for white noise, while spectral subtraction and MMSE methods were used to 
handle all three types of noise. (Page 552: Left Column, Last Paragraph) It would have 
been obvious to utilize a plurality of noise reduction techniques by selecting among 
noise models as taught by Sameti et al. in the noise reduction training method as 
disclosed by Trompf ('968) for the purpose of coping with noise level variation and 
avoiding excessive computation. 

Concerning claims 3 to 5, 14 to 16, and 18 to 20, Sameti et al. discloses a 
speech enhancement method adapted to handle arbitrary types of corrupting noise by 
selecting a noise model, and switching to a model representing a new noise type if 
required. (Pages 449 to 450: III. Noise Adaptation Algorithm) In particular, MAP and 
AMAP noise reduction algorithms were only used for white noise, while spectral 
subtraction and MMSE methods were used to handle three types of noise, i.e. white 
noise, helicopter noise, and recorded multi-talker party noise. (Page 552: Left Column, 
Last Paragraph) Thus, spectral subtraction can be applied to all the sets of noise, while 
MAP and AMAP algorithms are only applied to white noise, but not helicopter noise or 
multi-talker party noise. Sameti et al. suggests the method of noise model selection can 
successfully cope with noise level variation as well as different noise types, and keeps 
the noise model sufficiently compact so that excessive computation in enhancement is 
avoided. (Pages 449 to 450: III. Noise Adaptation Algorithm) It would have been 
obvious to utilize a plurality of noise reduction techniques by selecting among noise 
models as taught by Sameti et al. in the noise reduction training method as disclosed by 
Trompf ('968) for the purpose of coping with noise level variation and avoiding 
excessive computation. 

Concerning claims 6 to 9 and 21, Sameti et al. discloses a noise enhancement 
method that compares each noisy signal ("receiving and sampling noise in the test 
signal"), calculates a likelihood for the received noisy signal and each pretrained noise 
HMM ("identifying a separate probability"), and selects a noise model based on the 
likelihood ("selecting a probability"). (Page 552: Left Column: First Paragraph: Figure 6) 
Then, based on the type of noise identified ("identifying the set of noisy training data"), 
MAP and AMAP algorithms are only applied for white noise, but spectral subtraction 
and MMSE methods are applied for all types of noise ("applying the noise reduction 
technique"). (Page 552: Left Column: Last Paragraph) 

Concerning claims 12 and 13, Sameti et al. suggests noise enhancement 
involving recorded multi-talker party noise. (Page 552: Left Column, Last Paragraph) 
Trompf ('968) discloses separately supplying and summing speech-free noise signals 
from the microphone M and noise-free speech signals from memory S (column 4, lines 
24 to 28: Figure 3). Recorded speech and noise samples are an art-recognized 
alternative to live speech and noise samples. Thus, it would have been obvious to one 
having ordinary skill in the art to supply the summed speech-free noise signals and 
noise-free speech signals of Trompf ('968) as recorded noise and speech, respectively, 
because recorded samples are art-recognized alternatives to live samples as suggested 
by Sameti et al. 

Application/Control Number: 09/688,950 
Art Unit: 2654 

Concerning claims 22 to 26, Sameti et al. discloses a noise enhancement 
method that compares each noisy signal. (Page 552: Left Column: First Paragraph: 
Figure 6) Then, based on the type of noise identified, MAP and AMAP algorithms are 
only applied for white noise, but spectral subtraction and MMSE methods are applied for 
all types of noise. (Page 552: Left Column: Last Paragraph) The selected noise 
models are "a pattern recognition model" and "a second pattern recognition model", as 
applied to the claim limitations discussed above. 

Concerning claims 27 to 29, Sameti et al. discloses speech recognition is 
performed to recognize features, phonemes ("a sub-word acoustic unit"), or words. 
(Page 446, Right Column) Implicitly, a standard speech recognizer recognizes a series 
of words ("a string of words") from individual features, phonemes, and words. 

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner, whose telephone number is (703) 308-9064. 
The examiner can normally be reached from 8:30 AM to 6:00 PM, Monday to 
Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (703) 305-9645. The fax phone 
number for the organization where this application or proceeding is assigned is 
703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 

6/21/04 




Martin Lerner 
Examiner 